ABSTRACT

The Machine Aided Voice Translation (MAVT) system was developed in response to the shortage of
experienced military field interrogators with both foreign language proficiency and interrogation skills. Combining
speech recognition, machine translation, and speech generation technologies, the MAVT accepts an interrogator's
spoken English question and translates it into spoken Spanish. The spoken Spanish response of the potential
informant can then be translated into spoken English. Potential military and civilian applications for automatic
spoken language translation technology are discussed in this paper.

THE MACHINE AIDED VOICE TRANSLATION (MAVT) SYSTEM 

During times of military conflict it is important to acquire intelligence information quickly. The best sources of
timely information, however, are often foreign-language speaking people: defectors, Prisoners of War, and civilians
from the conflict area. Whenever and wherever conflict occurs, interrogators who are versed in the particular
language and dialect of potential informants, familiar with the Commander's [...], and knowledgeable about
interrogation techniques are a valuable asset - and an extremely rare one.

The Machine Aided Voice Translation (MAVT) system is an early prototype demonstration of the application
of current speech processing technology to help compensate for the shortage of suitably trained and experienced
interrogators. It allows a less skilled interrogator to "screen" potential informants. When an interrogator with little
or no foreign language skills asks questions by speaking into the microphone in English, the MAVT translates the
questions into machine-spoken Spanish. Upon hearing each question in his/her native language, the potential
informant responds by speaking into the microphone and the system translates the response into spoken English.
Based on the interrogator's perception of an informant's cooperativeness, reliability of information, and relevance of
[...]

[...] a Spanish-speaking interrogatee. Providing the MAVT with knowledge of the gender of the speaker allows better
speech recognition due to the more appropriate use of either a male or female speech "model."

Inputs to the system must be restricted to those that use words from the system's Spanish and English
dictionaries (lexicons) and those which use a word order allowed by its grammars. There are two MAVT grammars.
The first allows questions and answers about biographical information, so examples in English are "What is your
[...]" and "[...] unit designation"; examples in Spanish are "Mi rango es teniente general" and "Nací en
[...]". The second grammar focuses on mission-related information such as (in English) "Why was your
unit moving out to the south?" and "Is the main force heading in that direction?", and (in Spanish) "Proteger el
[...] del regimiento" and "Su misión es encontrar unidades americanas." The display includes a
scrollable list of examples of the inputs that are accepted by the speech recognizer. When a Spanish speaker is
anticipated, the display lists examples of acceptable Spanish responses.
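
The restriction on inputs can be illustrated with a minimal sketch. It is not the MAVT implementation: the tiny
lexicon, the word-category names, and the two word orders below are invented for illustration. An utterance is
accepted only if every word is in the lexicon and the sequence of word categories matches an allowed order.

```python
# Hypothetical toy lexicon (word -> category); not the MAVT lexicon.
LEXICON = {
    "what": "WH", "is": "BE", "your": "POSS",
    "rank": "NOUN", "unit": "NOUN", "mission": "NOUN",
}

# Hypothetical toy grammar: the allowed sequences of word categories.
ALLOWED_ORDERS = {
    ("WH", "BE", "POSS", "NOUN"),          # e.g. "what is your rank"
    ("WH", "BE", "POSS", "NOUN", "NOUN"),  # e.g. "what is your unit mission"
}


def tokenize(utterance):
    """Lowercase the utterance and strip punctuation around each word."""
    words = (w.strip("?.,!\"'") for w in utterance.lower().split())
    return [w for w in words if w]


def is_accepted(utterance):
    """Return True only if all words are known and their order is allowed."""
    words = tokenize(utterance)
    if any(w not in LEXICON for w in words):
        return False  # out-of-vocabulary word
    categories = tuple(LEXICON[w] for w in words)
    return categories in ALLOWED_ORDERS
```

Under these assumptions, "What is your rank?" is accepted, while "Your rank is what?" (disallowed word order) and
"What is your age?" (word missing from the lexicon) are rejected.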

The prototype is serving as the design basis for a follow-on development that will extend the English-Spanish
translation vocabulary and expand system capabilities to include English-Arabic and English-Russian
spoken language translations. The follow-on system will be completed in late 1996.

Language Systems Incorporated (LSI), of Woodland Hills, California, developed the MAVT and is the primary
contractor for the advanced development model.


The speech recognition system of MAVT is the PE200 Phonetic Engine produced by Speech Systems Inc.
The Phonetic Engine accepts speaker-independent, continuous speech. That is, it does not have to be trained
to any particular voice, and users can speak quite naturally and fluidly without pausing between each word
as would be required by an isolated word speech recognizer.

* English and Spanish examples are not intended to represent questions and their respective answers. 
Figure 1
MAVT Display 





Speech generation is performed by a DECtalk DTC01 speech generator from the Digital Equipment
Corporation. [...] of the DECtalk [...] and converts [...] to spoken words [...]

The core language translation software of the MAVT is LSI's DBG natural language processing system, hosted
on a Sun Microsystems workstation (SPARCstation). DBG was extended for this project with a multilingual
lexicon, a multilingual morphological component, and a language-independent syntactic parser. DBG works by
deriving semantic (meaning) representations of inputs and translating them into the language which is to be output.
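
The general idea of translating through a meaning representation can be sketched as follows. This is an
illustration only and does not reproduce DBG: the regular-expression "parsers", the concept identifiers (RANK,
LT_GENERAL), and the generation templates are all assumptions made for the example. A sentence is analyzed into a
language-neutral frame, and the frame is then rendered in the output language.

```python
import re

# Hypothetical analysis patterns, one per input language.
ANALYSIS = {
    "en": re.compile(r"^my (?P<attribute>\w+) is (?P<value>.+)$"),
    "es": re.compile(r"^mi (?P<attribute>\w+) es (?P<value>.+)$"),
}

# Hypothetical multilingual lexicon keyed by language-neutral concept IDs.
CONCEPTS = {
    "RANK": {"en": "rank", "es": "rango"},
    "LT_GENERAL": {"en": "lieutenant general", "es": "teniente general"},
}

# Hypothetical generation templates, one per output language.
TEMPLATES = {
    "en": "My {attribute} is {value}",
    "es": "Mi {attribute} es {value}",
}


def to_concept(phrase, lang):
    """Map a surface phrase in the given language to its concept ID."""
    for concept, forms in CONCEPTS.items():
        if forms[lang] == phrase:
            return concept
    raise KeyError(f"{phrase!r} is not in the {lang} lexicon")


def analyze(sentence, lang):
    """Derive a language-neutral semantic frame from a sentence."""
    match = ANALYSIS[lang].match(sentence.lower().rstrip("."))
    if match is None:
        raise ValueError("input is not covered by the grammar")
    return {
        "attribute": to_concept(match["attribute"], lang),
        "value": to_concept(match["value"], lang),
    }


def generate(frame, lang):
    """Render a semantic frame as a sentence in the output language."""
    return TEMPLATES[lang].format(
        attribute=CONCEPTS[frame["attribute"]][lang],
        value=CONCEPTS[frame["value"]][lang],
    )


def translate(sentence, source, target):
    return generate(analyze(sentence, source), target)
```

With these toy tables, `translate("Mi rango es teniente general", "es", "en")` yields "My rank is lieutenant
general". Because the frame holds concepts rather than words, adding an output language in this sketch only requires
new lexicon entries and a template, not a new translation rule for every language pair.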



Figure 2 

MAVT Architecture 


DUAL-USE OF MAVT TECHNOLOGY 

Military applications of automatic spoken language translation include its use for multi-national
military operations - facilitating communication among cooperative multi-lingual forces, and its use for deriving
intelligence information from speech communications. This latter application requires a burdensome mix of
knowledge-intensive skills similar to those involved in military interrogation since the analyst, or listener, must not
only monitor multiple lines of communication input and responsibly select and record information
pertinent to the mission, but must also possess foreign-language skills when monitoring foreign-language
communications.

The application of computer-based spoken language translation technology to language training can provide
convenient reinforcement of foreign language skills for experienced military linguists and
civilian students. Students could request language equivalents for spoken expressions in their native tongue
or attempt to speak phrases displayed on a screen or voiced by the computer. The correctness of a student's
attempt to speak in a foreign language would be determined by matching his utterance to the expected input. Instant
reinforcement can be provided, and intelligent computer prompting could guide students through difficult phrases.
Adding other media, video for instance, to such a training system further extends the value of the technology as a
resource because students, cued by visual prompts, can then devise their own wording. Further, computer-based
language training allows lessons to be varied depending on a student's level of competence.
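
Matching a student's utterance to the expected input might look like the sketch below. It assumes the speech
recognizer has already produced text; the function names and the feedback format are hypothetical. The word-level
comparison uses Python's standard `difflib.SequenceMatcher`, chosen here only for illustration.

```python
import difflib


def normalize(text):
    """Lowercase the text and strip punctuation around each word."""
    words = (w.strip("¿?¡!.,\"'") for w in text.lower().split())
    return [w for w in words if w]


def check_attempt(recognized, expected):
    """Compare a recognized utterance with the expected phrase, word by word."""
    got, want = normalize(recognized), normalize(expected)
    matcher = difflib.SequenceMatcher(a=want, b=got, autojunk=False)
    problems = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Record what was expected and what was actually said.
            problems.append((tag, want[i1:i2], got[j1:j2]))
    return {"correct": not problems, "problems": problems}
```

For the expected phrase "Su misión es encontrar unidades americanas.", an attempt that omits "es" is reported as
incorrect with the missing word identified, which is the kind of information a prompting component could use to
guide the student through the phrase.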

The prototype has attracted the interest of law enforcement organizations and emergency room medical
personnel. Most large metropolitan areas in the United States have many non-English speaking inhabitants, making
communication and acquisition of information difficult in some cases and almost impossible in others. As a result of
these communication problems, law enforcement organizations such as the Los Angeles Police Department have
expressed interest in the use of automatic spoken language translation technology for interviewing crime
[...] and suspects. Hospital staffers have noted the value the technology holds for emergency room
admittance of foreign-speaking patients. In a manner similar to the military application for interrogation, language
translation technology would allow law enforcement personnel to communicate with citizens, and hospital personnel
with patients, in their own languages without the time delay involved in locating an interpreter.

MAVT technology also finds application as a diplomatic, or business, briefing aid. Lack of a common language
need not stand in the way, in the future, of international visits and communications among those with common
political or business activities. Interest in the technology has been shown by a Texas organization eager to facilitate
diplomatic interactions with representatives of the Mexican government.

[...] the value of the technology as a tourist travel aid. Currently, words or simple phrases can be
entered into hand-held instruments that provide foreign-language translations, but how much better it would be to
speak into a similar instrument and have it vocalize our intent in the language, or languages, of others.
TECHNOLOGY LIMITATIONS

The current state of the component technologies of the MAVT places limitations on the near-term application of 
computer-based spoken language translation. Some of the deficiencies of those technologies are identified here. 

Speech recognition technology has improved substantially in just the last three years. Until very recently,
recognition of speaker-independent continuous speech was still an out-of-range laboratory research goal, and the
available isolated word speech recognition was not appropriate for very many applications. Current recognition capabilities will
suffice, however, only for those applications for which every spoken input can be anticipated. Broader application 
of the technology will at least require the allowance of less restrictive, or more relaxed, grammars. In the meantime, 
the technology continues to advance rapidly. 

Limitations set by the state of speech recognition technology on the breadth of the grammar actually make its 
use with text-based language translation viable since automatic translation technology is not yet capable of handling 
free-form text. Since every input is anticipated, correct translation of every possible input can fairly well be assured. 
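
One way to picture why a restricted grammar makes correct translation checkable is the sketch below, which
enumerates every sentence a small finite grammar can produce and reports any that lack a translation. The slot
grammar, the translation table, and the function names are invented for illustration and are not taken from the
MAVT.

```python
from itertools import product

# Hypothetical toy grammar: a sentence is one choice from each slot.
SLOTS = [
    ["what is your"],
    ["rank", "name"],
]

# Hypothetical translation table for the anticipated inputs.
TRANSLATIONS = {
    "what is your rank": "¿Cuál es su rango?",
    "what is your name": "¿Cuál es su nombre?",
}


def all_sentences(slots):
    """Enumerate every sentence the slot grammar can produce."""
    return [" ".join(choice) for choice in product(*slots)]


def untranslated(slots, translations):
    """Return the anticipated inputs that have no translation."""
    return [s for s in all_sentences(slots) if s not in translations]
```

Here `untranslated(SLOTS, TRANSLATIONS)` returns an empty list, so every input the toy grammar allows has a known
translation. Adding a word to a slot without extending the table would make the gap visible immediately, a check
that is not possible for free-form text.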
Improvements in machine translation of text will be required in the future. 

Broader application of speech translation technology will necessitate an improvement in intonation features 
offered by language generation systems. Computer-generated speech is currently robotic and monotone. 

And finally, MAVT components are specifically tailored to operate with a specific language pair and within a 
specific domain. Tools to port spoken language translation technology to new domains and to new languages are 
needed. 